Towards Generating Large Synthetic Phytoplankton Datasets for Efficient Monitoring of Harmful Algal Blooms
Climate change is increasing the frequency and severity of harmful algal
blooms (HABs), which cause significant fish deaths in aquaculture farms. This
contributes to ocean pollution and greenhouse gas (GHG) emissions since dead
fish are either dumped into the ocean or taken to landfills, which in turn
negatively impacts the climate. Currently, the standard method to enumerate
harmful algae and other phytoplankton is to manually observe and count them
under a microscope. This is a time-consuming, tedious and error-prone process,
resulting in compromised management decisions by farmers. Hence, automating
this process for quick and accurate HAB monitoring is extremely helpful.
However, this requires large and diverse datasets of phytoplankton images, and
such datasets are hard to produce quickly. In this work, we explore the
feasibility of generating novel high-resolution photorealistic synthetic
phytoplankton images, containing multiple species in the same image, given a
small dataset of real images. To this end, we employ Generative Adversarial
Networks (GANs) to generate synthetic images. We evaluate three different GAN
architectures: ProjectedGAN, FastGAN, and StyleGANv2 using standard image
quality metrics. We empirically show the generation of high-fidelity synthetic
phytoplankton images using a training dataset of only 961 real images. Thus,
this work demonstrates the ability of GANs to create large synthetic datasets
of phytoplankton from small training datasets, accomplishing a key step towards
sustainable systematic monitoring of harmful algal blooms.
Are Diffusion Models Vision-And-Language Reasoners?
Text-conditioned image generation models have recently shown immense
qualitative success using denoising diffusion processes. However, unlike
discriminative vision-and-language models, it is a non-trivial task to subject
these diffusion-based generative models to automatic fine-grained quantitative
evaluation of high-level phenomena such as compositionality. Towards this goal,
we introduce two innovations. First, we adapt diffusion-based models (in our
case, Stable Diffusion) to any image-text matching (ITM) task using a novel
method called DiffusionITM. Second, we introduce the Generative-Discriminative
Evaluation Benchmark (GDBench) with 7 complex vision-and-language
tasks, bias evaluation and detailed analysis. We find that Stable Diffusion +
DiffusionITM is competitive on many tasks and outperforms CLIP on compositional
tasks like CLEVR and Winoground. We further boost its compositional
performance with a transfer setup by fine-tuning on MS-COCO while retaining
generative capabilities. We also measure the stereotypical bias in diffusion
models, and find that Stable Diffusion 2.1 is, for the most part, less biased
than Stable Diffusion 1.5. Overall, our results point in an exciting direction
bringing discriminative and generative model evaluation closer. We will release
code and benchmark setup soon.
Comment: Accepted to NeurIPS 202
Score-based Diffusion Models in Function Space
Diffusion models have recently emerged as a powerful framework for generative
modeling. They consist of a forward process that perturbs input data with
Gaussian white noise and a reverse process that learns a score function to
generate samples by denoising. Despite their tremendous success, they are
mostly formulated on finite-dimensional spaces, e.g. Euclidean, limiting their
applications to many domains where the data has a functional form such as in
scientific computing and 3D geometric data analysis. In this work, we introduce
a mathematically rigorous framework called Denoising Diffusion Operators (DDOs)
for training diffusion models in function space. In DDOs, the forward process
perturbs input functions gradually using a Gaussian process. The generative
process is formulated by integrating function-valued Langevin dynamics. Our
approach requires an appropriate notion of the score for the perturbed data
distribution, which we obtain by generalizing denoising score matching to
function spaces that can be infinite-dimensional. We show that the
corresponding discretized algorithm generates accurate samples at a fixed cost
that is independent of the data resolution. We theoretically and numerically
verify the applicability of our approach on a set of problems, including
generating solutions to the Navier-Stokes equation viewed as the push-forward
distribution of forcings from a Gaussian Random Field (GRF).
Comment: 26 pages, 7 figures
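
The two ingredients named in the abstract above, denoising score matching and Langevin sampling, can be illustrated in the ordinary finite-dimensional setting. The sketch below is a one-dimensional toy and not the function-space DDO construction from the paper. The linear score model, the Gaussian toy data, and every name and constant in it (fit_linear_score, langevin_sample, MU, S, SIGMA, the step size) are assumptions chosen for illustration.

```python
import math
import random


def fit_linear_score(data, sigma, rng):
    """Fit a linear score model s(x) = a*x + b by denoising score matching.

    Each data point is perturbed with Gaussian noise of scale sigma, and the
    model regresses onto the target -z/sigma (ordinary least squares).
    """
    xs, ts = [], []
    for x in data:
        z = rng.gauss(0.0, 1.0)
        xs.append(x + sigma * z)  # perturbed sample (forward noising step)
        ts.append(-z / sigma)     # denoising score matching target
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(ts) / n
    cov = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, ts)) / n
    var = sum((x - mean_x) ** 2 for x in xs) / n
    a = cov / var
    b = mean_t - a * mean_x
    return a, b


def langevin_sample(a, b, n_steps, step, rng, x0=0.0):
    """Run unadjusted Langevin dynamics driven by the fitted score a*x + b."""
    x = x0
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x + step * (a * x + b) + math.sqrt(2.0 * step) * noise
    return x


# Toy data (assumed): a 1-D Gaussian with mean MU and standard deviation S,
# perturbed at a single noise level SIGMA.
rng = random.Random(0)
MU, S, SIGMA = 2.0, 0.5, 0.5
data = [rng.gauss(MU, S) for _ in range(20000)]

a, b = fit_linear_score(data, SIGMA, rng)

# The perturbed density is N(MU, S^2 + SIGMA^2), whose exact score is
# -(x - MU) / (S^2 + SIGMA^2); the fit should land close to that line.
samples = [langevin_sample(a, b, 200, 0.05, rng) for _ in range(2000)]
sample_mean = sum(samples) / len(samples)
```

For this toy, the exact score of the perturbed density is linear, so a should land near -1/(S^2 + SIGMA^2) and b near MU/(S^2 + SIGMA^2), and the Langevin samples should centre near MU. Because the sampler is not corrected for its finite step size, the sample variance is slightly larger than S^2 + SIGMA^2.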